Reflections and Outcomes from the harp Training Course 2024

Andrew Singleton [MET Norway]

What is harp?

  • A framework for (exploratory) data analysis in R

  • Verification and visualizations are the main built in applications

  • All data has the same format no matter the source

    • grib (1/2)
    • FA
    • NetCDF
    • vfld / vobs
    • OBSOUL
    • HDF5 (Opera)
    • SQLite

The traing course

4 - 8 March 2024 :: Dublin, Ireland

Participants

18 in person

18 online

Instructors

Andrew Singleton [MET Norway]

Alex Deckmyn [RMI]

Presenters

James Fannon (Met Éireann)

Carlos Peralta (DMI)

Polly Schmederer (GeoSphere Austria)

Samuel Viana (AEMET)

Juanje Gonzalez (AEMET)

Feedback Received

5 in person

4 online

Closed Questions

Attendance

Was the course useful?

How was the level?

How was the pace?

Open Questions

  • Was there anything that was a total game changer for you? (e.g. something you didn’t know before, but will make your work much easier)

  • Was there anything that was especially bad? (e.g. really badly explained, really boring, really irrelevant)

  • Any other comments

Topics

Getting started

Installing harp

Setting up a project

Data for the course

Basic reading of data

read_grid() to read individual files

read_forecast() to read multiple files

harp’s Data Classes

Interpolation to points at read time

Lagged ensembles

Comments

  • The parameter handling and scaling (e.g. converting temperature from Kelvin to °C) is much easier now

  • Being able to get wind speed and direction from U and V is really helpful

Point Verification

Steps in the workflow - RRJV(S/P) 1

RRJV

read_point_forecast()

read_point_obs()

join_to_fcst()

det/ens_verify()

S/P

plot_point_verif()

save_point_verif()

shiny_plot_point_verif()

Point Verification

Steps in the workflow - RRJV(S/P) 1

Observation errors

Grouped verification

Vertical profiles

Conditional verification

Comments

  • It was my first time seeing conditional verification, which will be very useful for analysing performance in specific circumstances

  • Stratifying verification by wind direction will be especially useful, as long as you know what the direction in degrees means on the compass ;-)

Building a verification script

Basic Skeleton

Adding the code

Scaling and observation errors

Non Standard evaluation (NSE)

Non Standard evaluation (NSE)

harp uses NSE for a smoother ride for interactive running

Causes problems for using variables

Embracing variables with {{ }}

Building a verification script

Basic Skeleton

Adding the code

Scaling and observation errors

Non Standard evaluation (NSE)

Defining parameters

Looping over parameters with for loops

Functional programming with walk()

harp scripts for UWC-West, ACCORD and DEODE

Comments

  • ‘Build a Script’ was particularly useful as I am not an R expert and the good practises discussed here were very useful

  • Managed to code the basics script for point verification for a new task at our institute during the session on point verification

Spatial Verification

Workflow for spatial verification

Choosing a verification grid & regridding

Neighbourhood Contingency Tables

HIRA

Point verification like workflow for Ensemble FSS

Using dFSS for upscaling probabilistic forecasts

Introduction to the panelification tool

Comments

Good

Spatial verification was something that I had not worked within HARP of course, so it was useful to identify ways it can easily replace some existing tools

Not so good

It would be great if the spatial part was provided in a tidier manner. It is usually the case that the point verification and data ingestion part is prepared before hand and then the code is posted online every day. This does not happen with the spatial part, and there are always last minute developments that would be committed during the week . Also, using the development version in the course might not be the best practice in my opinion, since this version is changing and the functions might change. Would it not be better to use the current stable version instead?

Plotting Spatial data

Basic plotting with plot_field() and plot_domain()

Where to find colour palettes

Using ggplot() with geom_geomraster()

Faceting and formatting titles

Tricks for improving plotting speed

geom_georaster()

ggplot geom for plotting georeferenced rasters

ggplot(t2m, aes(geofield = fcst)) + geom_georaster()

geom_georaster()

map <- get_map(precip_1h$fcst)
ggplot() + 
  geom_polygon(aes(x, y, group = group), data = map, fill = "grey") +
  geom_georaster(
    aes(geofield = fcst), data = precip_1h,
    upscale_method = "downsample", upscale_factor = 4
  ) +
  geom_polygon(
    aes(x, y, group = group), data = map,
    fill = "transparent", colour = "grey20"
  ) + 
  scale_fill_scico(
    "mm", palette = "oslo", trans = "log", direction = -1,
    limits = c(0.125, 64), breaks = seq_double(0.125, 10),
    na.value = "transparent"
  ) +
  facet_wrap(~member, nrow = 2) +
  coord_equal(expand = FALSE) + 
  theme_harp_map()

geom_georaster()

Comments

  • The spatial data handling in harp was very impressive and I’ll be using it much more for visualisation going forward

  • Good Getting more familiar with the visualization capabilities of harp

Writing a data reading function

Interface to MET Norway Frost database

  • Required inputs and outputs

  • Not so easy to wing it!

  • Familiarity with the data format is most important

Dealing with local grib tables

Alternative strategies

grib_opts(
  param_find = list(param = use_grib_*())
)
  • Can be dangerous for grib2

  • Copy local definitions to Rgrib2 library directory

  • Send local definitions to Alex for inclusion in Rgrib2

  • Send local definitions to ECMWF for inclusion in eccodes

Contributing to harp

  • Demonstration of updating a plotting function

  • Github - fork -> feature branch -> pull -> merge -> push -> PR

General feedback

Good

  • The sum total of many new not very big things new to me has made a lot of difference

  • It was great to hear about new packages, functions and methods that I wasn’t aware of. One tends to use what has worked in the past - so good to “be upgraded”

  • The courses and training were very helpful, especially with keeping video captures as well as the tests. Even though I didn’t know before that we could conduct single-day assessments, whereas previously it was a minimum of 7 days

Not so good

  • It was complicated to reproduce the examples at the same time they were explained (it is not only a matter of typing more slowly!). And it is interesting to do so it if you want to try some of those examples with your own files and check quickly if everything works in your case. It would help to have the examples in advance to let you make copy-paste (with make the pertinent modifications, of course)

  • Although I know that harp is in constant development, in my opinion, the course should be mostly based on the stable release of harp. When using the development branches of harp it is quite uncertain whether what you learnt (and maybe put to work operationally) will remain exactly the same in the next reference version

  • I gave up trying to code with Andrew and Alex, but rather used the time to follow and understand. Having everything online afterwards is great, because now I can go back through everything in the quiet of my office

General Comments

  • Although harp has a great potential of use, my knowledge of all harp packages (included external ones) is very limited and takes time to learn. For this reason, and in my particular case, I find it difficult to be able to do things very different from the examples you are showing or to use harp apart from the combination script & configuration_file

  • Apart from the “lessons” you upload from the course, I do not know if it would be a good idea to sort out some of the examples in a kind of “How to” section. To have a cheat sheet for a quick reference to the functions is another idea

  • It is a great idea to record the presentations and make them available online. I hope this is done in future courses as well!

Website

The future

Short-term Priorities

  • Maps in shiny app

  • Sub hourly lead times for all file formats

  • Comparator for thresholds

  • Quantile thresholds

  • All spatial scores to have methods allowing same workflow as point verification

  • On the fly computation in shiny app

  • Refactoring verification functions for faster computation with many groups

  • Cheatsheets

  • Publication

  • The Book of harp 1st Ed.

Longer-term Prospects

  • How do we make best use of cloud storage data, e.g. S3?

  • Interfaces to parquet, feather / arrow, duckDB

  • Interfaces to ODB (1? / 2?) / BUFR

  • Ingestion of satellite data

  • We would love for people to join the harp development team

  • Collaboration requires commitment - nothing can be achieved working on something unfamiliar in 0.5 or 1 pm.

  • Andrew and Alex are happy to guide people around the code

  • If not R, C++ preferred, but existing python algorithms could be adopted

  • D3 (faster, more interactive) graphics in shiny app?

  • Verification output in SQLite / JSON / parquet / arrow

  • Replace shiny with observable framework?

Thanks to our hosts at Met Éireann!